Posterior Consistency



A Dynamics-Informed Gaussian Process Framework for 2D Stochastic Navier-Stokes via Quasi-Gaussianity

Hamzi, Boumediene, Owhadi, Houman

arXiv.org Machine Learning

Yet a fundamental gap remains: while these methods depend critically on the choice of prior covariance kernel, most kernels are selected for computational convenience (e.g., Gaussian/RBF kernels) or generic smoothness assumptions (e.g., Matérn) rather than being rigorously grounded in the system's long-time statistical structure. Recent breakthroughs in stochastic PDE theory now make it possible to close this gap by constructing priors directly from the invariant-measure geometry of the underlying dynamics. Work of Coe, Hairer, and Tolomeo [7] establishes a remarkable geometric property of the two-dimensional stochastic Navier-Stokes (2D SNS) equations: although the dynamics are highly nonlinear, their unique invariant measure is equivalent, in the sense of mutual absolute continuity, to the Gaussian invariant measure of the linearized Ornstein-Uhlenbeck (OU) process. Equivalence means the two measures share the same support, null sets, and typical events, differing only by a positive Radon-Nikodym derivative. This reveals that the equilibrium statistical geometry is Gaussian, even when individual realizations are not.
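
To fix notation for the equivalence claim (the symbols below are this note's, not the abstract's): write $\nu$ for the unique invariant measure of the 2D SNS dynamics and $\mu$ for the Gaussian invariant measure of the linearized OU process. Mutual absolute continuity means

$$
\nu \ll \mu \quad \text{and} \quad \mu \ll \nu, \qquad \frac{d\nu}{d\mu} = \rho \ \text{ with } \ \rho > 0 \ \ \mu\text{-a.s.},
$$

so that $\mu(A) = 0 \iff \nu(A) = 0$ for every event $A$, and equilibrium expectations of the nonlinear system can be computed against the Gaussian measure via $\mathbb{E}_\nu[f] = \mathbb{E}_\mu[\rho\, f]$. This identity is what licenses building a Gaussian process prior from the OU covariance.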



Bayesian Invariance Modeling of Multi-Environment Data

Wu, Luhuan, Yin, Mingzhang, Wang, Yixin, Cunningham, John P., Blei, David M.

arXiv.org Machine Learning

Invariant prediction [Peters et al., 2016] analyzes feature/outcome data from multiple environments to identify invariant features, i.e., those with a stable predictive relationship to the outcome. Such features support generalization to new environments and help reveal causal mechanisms. Previous methods have primarily tackled this problem through hypothesis testing or regularized optimization. Here we develop Bayesian Invariant Prediction (BIP), a probabilistic model for invariant prediction. BIP encodes the indices of the invariant features as a latent variable and recovers them by posterior inference. Under the assumptions of Peters et al. [2016], the BIP posterior targets the true invariant features. We prove that the posterior is consistent and that greater environment heterogeneity leads to faster posterior contraction. To handle many features, we design an efficient variational approximation called VI-BIP. In simulations and real data, we find that BIP and VI-BIP are more accurate and scalable than existing methods for invariant prediction.
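
As a minimal sketch of the latent-variable idea (illustrative only: the uniform prior over feature subsets, the conjugate-Gaussian evidence, and the pooled-versus-separate comparison are assumptions of this sketch, not the authors' model), one can score each candidate invariant set S by how strongly the evidence favors a single S-regression shared across environments over independent per-environment refits:

```python
# Illustrative sketch of Bayesian invariant prediction (not the authors' code).
# For each candidate feature set S, compare the evidence of one regression on S
# shared across all environments against independent per-environment refits;
# invariant sets favor sharing. The conjugate Gaussian evidence is an assumption.
from itertools import combinations
import numpy as np
from scipy.stats import multivariate_normal

def log_evidence(X, y, S, tau2=1.0, sigma2=1.0):
    """log p(y | X_S) with beta ~ N(0, tau2 I) and N(0, sigma2 I) noise."""
    Xs = X[:, list(S)]
    cov = sigma2 * np.eye(len(y)) + tau2 * Xs @ Xs.T
    return multivariate_normal.logpdf(y, mean=np.zeros(len(y)), cov=cov)

def subset_posterior(envs, d):
    """envs: list of (X_e, y_e); returns an approximate posterior over S (small d only)."""
    X = np.vstack([Xe for Xe, _ in envs])          # pooled design
    y = np.concatenate([ye for _, ye in envs])     # pooled outcomes
    subsets = [S for k in range(d + 1) for S in combinations(range(d), k)]
    scores = []
    for S in subsets:
        pooled = log_evidence(X, y, S)                             # shared model
        separate = sum(log_evidence(Xe, ye, S) for Xe, ye in envs)  # per-env refits
        scores.append(pooled - separate)                           # invariance Bayes factor
    p = np.exp(np.array(scores) - max(scores))                     # uniform prior over S
    return dict(zip(subsets, p / p.sum()))
```

Brute-force enumeration is only viable for small d; a variational approximation such as VI-BIP exists precisely to avoid this combinatorial search.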


Posterior Consistency for Missing Data in Variational Autoencoders

Sudak, Timur, Tschiatschek, Sebastian

arXiv.org Machine Learning

We consider the problem of learning Variational Autoencoders (VAEs), i.e., a type of deep generative model, from data with missing values. Such data is omnipresent in real-world applications of machine learning because complete data is often impossible or too costly to obtain. We particularly focus on improving a VAE's amortized posterior inference, i.e., the encoder, which in the case of missing data can learn posterior distributions that are inconsistent across missingness patterns. To this end, we provide a formal definition of posterior consistency and propose an approach for regularizing an encoder's posterior distribution that promotes this consistency. We observe that the proposed regularization suggests a different training objective than that typically considered in the literature when facing missing values. Furthermore, we empirically demonstrate that our regularization leads to improved performance in missing value settings in terms of reconstruction quality and downstream tasks utilizing uncertainty in the latent space. This improved performance can be observed for many classes of VAEs, including VAEs equipped with normalizing flows.
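
A minimal sketch of what such a consistency regularizer could look like (an assumed form for illustration; the paper's formal definition and objective may differ): penalize the divergence between the encoder posteriors obtained from the same sample under two missingness patterns, here via a symmetrized KL between the diagonal Gaussians produced by the encoder:

```python
# Assumed sketch, not the paper's exact objective: encourage the encoder
# q(z | x_obs) to give similar posteriors for a sample and a further-masked
# copy of it, via a symmetrized KL between the two diagonal Gaussians.
import torch

def posterior_consistency_penalty(encoder, x, mask_a, mask_b):
    """encoder(x, mask) -> (mu, logvar) of a diagonal Gaussian q(z | x_obs)."""
    mu_a, logvar_a = encoder(x, mask_a)   # posterior under one missingness pattern
    mu_b, logvar_b = encoder(x, mask_b)   # posterior under a coarser pattern
    var_a, var_b = logvar_a.exp(), logvar_b.exp()
    kl_ab = 0.5 * (logvar_b - logvar_a + (var_a + (mu_a - mu_b) ** 2) / var_b - 1)
    kl_ba = 0.5 * (logvar_a - logvar_b + (var_b + (mu_b - mu_a) ** 2) / var_a - 1)
    return (kl_ab + kl_ba).sum(dim=-1).mean()   # symmetrized KL, averaged over batch
```

In training, such a penalty would be added to the usual ELBO-based loss with a tunable weight.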


Monte Carlo inference for semiparametric Bayesian regression

Kowal, Daniel R., Wu, Bohan

arXiv.org Machine Learning

Data transformations are essential for broad applicability of parametric regression models. However, for Bayesian analysis, joint inference of the transformation and the model parameters typically requires restrictive parametric transformations or nonparametric representations that are computationally inefficient and cumbersome to implement and analyze, which limits their usability in practice. This paper introduces a simple, general, and efficient strategy for joint posterior inference of an unknown transformation and all regression model parameters. The proposed approach directly targets the posterior distribution of the transformation by linking it with the marginal distributions of the independent and dependent variables, and then deploys a Bayesian nonparametric model via the Bayesian bootstrap. Crucially, this approach delivers (1) joint posterior consistency under general conditions, including multiple model misspecifications, and (2) efficient Monte Carlo (not Markov chain Monte Carlo) inference for the transformation and all parameters in important special cases. These tools apply across a variety of data domains, including real-valued, integer-valued, compactly-supported, and positive data. Simulation studies and an empirical application demonstrate the effectiveness and efficiency of this strategy for semiparametric Bayesian analysis with linear models, quantile regression, and Gaussian processes.
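
To illustrate the Bayesian-bootstrap mechanism described in the abstract (a simplified sketch: the standard-normal reference CDF and all names below are assumptions of this note; the paper links the transformation to the marginals of both the dependent and independent variables rather than to a fixed reference):

```python
# Illustrative sketch: posterior draws of an unknown monotone transformation g
# via the Bayesian bootstrap. Each draw (1) samples Dirichlet weights on the
# observed y values to get a random marginal CDF, then (2) composes with an
# assumed standard-normal reference CDF: g(y) = Phi^{-1}(F_y(y)).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

def draw_transformation(y, n_draws=100, eps=1e-3):
    """Return Monte Carlo draws of g evaluated at the sorted data points."""
    order = np.argsort(y)
    y_sorted = y[order]
    draws = []
    for _ in range(n_draws):
        w = rng.dirichlet(np.ones(len(y)))   # Bayesian bootstrap weights
        F_y = np.cumsum(w[order])            # weighted empirical CDF at y_sorted
        F_y = np.clip(F_y, eps, 1 - eps)     # keep the probit finite
        draws.append(norm.ppf(F_y))          # g(y) = Phi^{-1}(F_y(y))
    return y_sorted, np.asarray(draws)
```

Each draw of the transformation can then be combined with draws of the regression parameters, which is what enables direct Monte Carlo, rather than MCMC, inference in the special cases the abstract mentions.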


A Bayesian sparse factor model with adaptive posterior concentration

Ohn, Ilsang, Lin, Lizhen, Kim, Yongdai

arXiv.org Artificial Intelligence

In this paper, we propose a new Bayesian inference method for a high-dimensional sparse factor model that allows both the factor dimensionality and the sparse structure of the loading matrix to be inferred. The novelty is to introduce a certain dependence between the sparsity level and the factor dimensionality, which leads to adaptive posterior concentration while keeping computational tractability. We show that the posterior distribution asymptotically concentrates on the true factor dimensionality and, more importantly, that this posterior consistency is adaptive to the sparsity level of the true loading matrix and the noise variance. We also prove that the proposed Bayesian model attains the optimal detection rate of the factor dimensionality in more general situations than those considered in the literature. Moreover, we obtain a near-optimal posterior concentration rate for the covariance matrix. Numerical studies are conducted and show the superiority of the proposed method compared with other competitors.
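
For orientation, the standard sparse factor model underlying such analyses (the notation is this note's; the paper's specific prior coupling sparsity to dimensionality is only indicated schematically):

$$
y_i = \Lambda \eta_i + \epsilon_i, \qquad \eta_i \sim N(0, I_K), \quad \epsilon_i \sim N(0, \sigma^2 I_p), \qquad \operatorname{Cov}(y_i) = \Lambda \Lambda^\top + \sigma^2 I_p,
$$

where $\Lambda \in \mathbb{R}^{p \times K}$ is a sparse loading matrix. The adaptive construction places a joint prior $\pi(K, \Lambda)$ in which the probability that an entry of $\Lambda$ is nonzero depends on $K$, so that overly large candidate dimensionalities are penalized through the implied sparsity.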


Posterior Consistency of the Silverman g-prior in Bayesian Model Choice

Neural Information Processing Systems

Kernel supervised learning methods can be unified by utilizing tools from regularization theory. The duality between regularization and priors leads to interpreting regularization methods in terms of maximum a posteriori estimation and has motivated Bayesian interpretations of kernel methods. In this paper we pursue a Bayesian interpretation of sparsity in the kernel setting by making use of a mixture of a point-mass distribution and a prior that we refer to as the "Silverman g-prior." We provide a theoretical analysis of the posterior consistency of a Bayesian model choice procedure based on this prior. We also establish the asymptotic relationship between this procedure and the Bayesian information criterion.
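
A hedged reconstruction of the prior's shape (the covariance $g\,\sigma^2 K_\gamma^{-1}$ is the natural kernel analogue of Zellner's g-prior and is recalled here as an assumption, not quoted from the paper): for a kernel expansion $f(x) = \sum_i \beta_i k(x, x_i)$ with Gram matrix $K$, introduce inclusion indicators $\gamma_i \in \{0, 1\}$ and set

$$
\pi(\beta \mid \gamma, \sigma^2) = \Big[\prod_{i : \gamma_i = 0} \delta_0(\beta_i)\Big] \times N\big(\beta_\gamma \mid 0,\; g\,\sigma^2 K_\gamma^{-1}\big),
$$

so that model choice amounts to comparing posterior probabilities of the indicator vectors $\gamma$, and the point mass at zero induces sparsity in the kernel coefficients.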


Asymptotic Properties for Bayesian Neural Network in Besov Space

Lee, Kyeongwon, Lee, Jaeyong

arXiv.org Artificial Intelligence

Neural networks have shown great predictive power when dealing with various kinds of unstructured data such as images and natural language. The Bayesian neural network captures the uncertainty of prediction by placing a prior distribution on the model parameters and computing the posterior distribution. In this paper, we show that the Bayesian neural network with a spike-and-slab prior is consistent with a nearly minimax convergence rate when the true regression function lies in a Besov space. Even when the smoothness of the regression function is unknown, the same posterior convergence rate holds, and thus the spike-and-slab prior is adaptive to the smoothness of the regression function. We also consider a shrinkage prior, which is more feasible in practice, and show that it attains the same convergence rate. In other words, we propose a practical Bayesian neural network with guaranteed asymptotic properties.
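
The generic spike-and-slab form referred to here, applied independently to each network weight $w_j$ (an illustrative template; the paper's calibration of the inclusion probability and slab scale to the network architecture is omitted):

$$
\pi(w_j) = (1 - \lambda)\, \delta_0(w_j) + \lambda\, N(w_j \mid 0, \sigma_0^2), \qquad j = 1, \dots, T,
$$

where $\delta_0$ is a point mass at zero, $\lambda$ is the prior inclusion probability, and $T$ is the number of network parameters. The continuous shrinkage alternative replaces the point mass with a density sharply peaked at zero, which avoids discrete model-space exploration during posterior computation.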


Posterior Consistency for Bayesian Relevance Vector Machines

Fang, Xiao, Ghosh, Malay

arXiv.org Machine Learning

Statistical modeling and inference problems with sample sizes substantially smaller than the number of available covariates are challenging. Chakraborty et al. (2012) carried out a full hierarchical Bayesian analysis of nonlinear regression in such situations using relevance vector machines based on reproducing kernel Hilbert spaces (RKHS), but did not provide any theoretical properties associated with their procedure. The present paper revisits their problem, introduces a new class of global-local priors different from theirs, and provides results on posterior consistency as well as posterior contraction rates.
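
For readers unfamiliar with the term, the generic global-local scale-mixture template (a standard form, recalled here for orientation; the specific class introduced in the paper differs in its choice of local and global scale densities) places, on each coefficient $\beta_j$,

$$
\beta_j \mid \lambda_j, \tau \sim N(0, \lambda_j^2 \tau^2), \qquad \lambda_j \sim \pi_{\text{local}}, \qquad \tau \sim \pi_{\text{global}},
$$

where the local scales $\lambda_j$ allow individual coefficients to escape shrinkage while the global scale $\tau$ controls overall sparsity; taking $\pi_{\text{local}}$ half-Cauchy recovers the horseshoe prior as a familiar example.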